ABSTRACT

Chow and Kaneko proposed a method of variable thresholding in
which an image is divided into windows; thresholds are
selected for those windows that have bimodal histograms;
and these thresholds are interpolated to define a variable
threshold for the entire image. This method was applied to
several TV images of machine parts; the results obtained
appeared to be considerably better than the results of
thresholding at a fixed level. An extension of the method
was defined that allowed histograms to be either bimodal or
trimodal; this yielded some further improvement in the
results, but was also more sensitive to shadows. Finally, an
adaptive quantization scheme, based on histogram peak
sharpening, was applied to two of the images; the results
do not seem to be as good as those obtained using variable
thresholding. Some results for FLIR images of tactical
targets are also presented.
1. Introduction

In [1] Chow and Kaneko proposed a method of variable
thresholding for image segmentation. In this method, the
image is divided into windows; a gray level histogram is
computed for each window; and thresholds are selected for those
windows that have bimodal histograms. (A bimodal histogram
indicates that the gray level population in the given window
is a mixture of two subpopulations; by choosing a threshold
that separates the two peaks, we can discriminate between
these populations.) These thresholds are then interpolated
(or extrapolated) to define a variable threshold for the
entire image. This method is particularly useful in cases
where there is a large range of variation in gray scale from
one part of the image to another, so that a single fixed
threshold cannot be used for the entire image. Chow and
Kaneko successfully applied this method to detect the heart
region on chest x-rays.

This report describes an application of the Chow-Kaneko
approach to TV images of machine parts. These images were
obtained from General Motors Research Laboratory; they are
part of a data base described in [2]. Some experiments were
also performed using Forward Looking InfraRed (FLIR) images
of tactical targets; these are described in Section 5.

The TV images used are shown in Figure 1. Parts (a) and
(b) each show four nonoverlapping "connecting rods", but (b)
has more uneven illumination than (a). Parts (c) and (d)
show cases involving overlap and shadows; these conditions
are much more severe in (d) than in (c).

The version of Chow and Kaneko's method used is
described in Section 2, and the results obtained are compared
with the results of using a single fixed threshold for the
entire image. The variable-threshold results are considerably
better. An extension of the Chow-Kaneko approach, described
in Section 3, took into consideration both bimodal and
trimodal histograms, and allowed either one or two thresholds to
be chosen for each window; this yielded some further
improvement in the results, but was also more sensitive to shadows.
Finally, an adaptive quantization scheme, based on histogram
peak sharpening [3-4], was applied to the images; the results
(Section 4) do not seem to be as good as those obtained using
variable thresholding. Section 5 shows results for a set of
FLIR images.


2. Variable thresholding based on local bimodality

The first step in the Chow/Kaneko method is to divide
the image into windows. The windows used for the images of
Figure 1 are indicated by grid boxes. In this case the images
were 256x243 pixels each; we used windows of size 32x30 (the
bottom three rows of each image were ignored), so that each
image was divided into 64 windows of 960 points each.
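The windowing step can be sketched as follows. This is a minimal
illustration, not the program used for the experiments: the function
name, the list-of-rows image representation, and the window shape of
32 columns by 30 rows are assumptions. The shape was chosen because it
is the one that yields 64 windows of 960 points each on a 256x240
image.

```python
def window_histograms(image, win_h=30, win_w=32, levels=32):
    """Split `image` (a list of rows of integer gray levels) into
    non-overlapping windows and return {(row, col): histogram}.

    The window shape is an assumption; rows or columns left over at the
    bottom or right edge are ignored.
    """
    n_rows = len(image) // win_h
    n_cols = len(image[0]) // win_w
    hists = {}
    for r in range(n_rows):
        for c in range(n_cols):
            hist = [0] * levels
            for y in range(r * win_h, (r + 1) * win_h):
                row = image[y]
                for x in range(c * win_w, (c + 1) * win_w):
                    hist[row[x]] += 1  # count points at this gray level
            hists[(r, c)] = hist
    return hists
```

Each histogram returned here plays the role of F(i) in the steps that
follow.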

The method used to determine bimodality and to select
thresholds for the bimodal histograms was based on a process
of Gaussian fitting. This method consisted of the following
steps, which were carried out for each of the window
histograms:

a) The mean and standard deviation of the histogram are
computed. These are defined by

    $$ \mu = \frac{1}{N} \sum_i F(i)\, i , \qquad \sigma^2 = \frac{1}{N} \sum_i F(i)\,(i-\mu)^2 $$

where F(i) is the histogram value for gray level i
(i.e., the number of window points having gray level
i), and N is the number of points in the window
(= 960 in our case). [In our images, there were 32
gray levels, so that all sums were taken over the
range $0 \le i \le 31$.] If $\sigma \le 3$, no threshold was
computed for the histogram. If $\sigma > 3$, the threshold
computation was carried out as described next.


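b) A least-squares fit of

    $$ f(i) = \frac{P_1}{\sigma_1} e^{-(i-\mu_1)^2/2\sigma_1^2} + \frac{P_2}{\sigma_2} e^{-(i-\mu_2)^2/2\sigma_2^2} $$

to the histogram F(i) is found by adjusting the
parameters $P_1$, $\mu_1$, $\sigma_1$, $P_2$, $\mu_2$, and $\sigma_2$. This is
done as follows: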


b1) The histogram is smoothed by taking a local weighted
average:

    $$ F_s(i) = F(i-2) + 2F(i-1) + 3F(i) + 2F(i+1) + F(i+2) $$

On the smoothed histogram, the deepest valley (=
lowest value) is found, and is used to divide the
histogram into two parts. Initial estimates of
$P_1$, $\mu_1$, $\sigma_1$, $P_2$, $\mu_2$, $\sigma_2$ are computed on these two
parts (for the original histogram F(i)) as follows:

    $$ N_1 = \sum_{i=0} F(i) $$
b2) A hill-climbing method is used to minimize

    $$ \sum_{i=0}^{31} [f(i) - F(i)]^2 , $$

using the program FMCG, which is available in the IBM
System/360 Scientific Subroutine Package. In this
program, the following parameters were used:

EST (estimate of the minimum function value) = 1.0
EPS (expected absolute error) = 0.1
LIMIT (maximum number of iterations) = 30

c) The resultant best-fitting f(i) is tested for
bimodality, based on the criteria

    $$ \mu_2 - \mu_1 > 4 $$

    $$ 0.1 < \sigma_1/\sigma_2 < 10.0 $$

    $$ \delta_{12} < 0.8 $$

where $\delta_{12}$, the valley-to-peak ratio, is defined by

    $$ \delta_{12} = \frac{\text{minimum value of } f \text{ in } [\mu_1, \mu_2]}{\min[f(\mu_1), f(\mu_2)]} $$

If these tests are not satisfied, no threshold is
selected for that window. If they are satisfied, a
threshold is selected that minimizes the probability
of misclassification for the mixture distribution
f(i). This threshold t satisfies

    $$ \frac{P_1}{\sigma_1} e^{-(t-\mu_1)^2/2\sigma_1^2} = \frac{P_2}{\sigma_2} e^{-(t-\mu_2)^2/2\sigma_2^2} . $$

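Steps (a) to (c) can be sketched for a single window as follows. This
is a minimal illustration under stated assumptions, not the original
program. All function names are hypothetical. The fitted parameters
(P, mu, sigma) of the two Gaussians are taken as given, since the FMCG
fit itself is not reproduced here. Gray levels outside the histogram
are treated as zero when smoothing. The threshold is located by
bisection on the point between the two means where the two components
of f are equal, which assumes that each component dominates at its own
mean.

```python
import math

def mean_std(hist):
    """Mean and standard deviation of a gray-level histogram (step a)."""
    n = sum(hist)
    mu = sum(f * i for i, f in enumerate(hist)) / n
    var = sum(f * (i - mu) ** 2 for i, f in enumerate(hist)) / n
    return mu, math.sqrt(var)

def smooth(hist):
    """1-2-3-2-1 weighted sum of step b1; out-of-range levels count as 0
    (an assumption about how the ends are treated)."""
    weights = (1, 2, 3, 2, 1)
    n = len(hist)
    return [sum(w * hist[i + k - 2]
                for k, w in enumerate(weights) if 0 <= i + k - 2 < n)
            for i in range(n)]

def component(x, p, mu, sigma):
    """One term of f: (P / sigma) * exp(-(x - mu)^2 / (2 sigma^2))."""
    return (p / sigma) * math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))

def is_bimodal(g1, g2, min_sep=4.0, max_delta=0.8, samples=200):
    """Criteria of step c for fitted components g = (P, mu, sigma)."""
    (_, mu1, s1), (_, mu2, s2) = g1, g2
    if not (mu2 - mu1 > min_sep and 0.1 < s1 / s2 < 10.0):
        return False

    def f(x):
        return component(x, *g1) + component(x, *g2)

    # Valley-to-peak ratio, with the valley located on a sampled grid.
    valley = min(f(mu1 + (mu2 - mu1) * k / samples)
                 for k in range(samples + 1))
    return valley / min(f(mu1), f(mu2)) < max_delta

def min_error_threshold(g1, g2, iterations=60):
    """Bisect between the means for the point where the components are
    equal (assumes component 1 dominates at mu1, component 2 at mu2)."""
    lo, hi = g1[1], g2[1]
    for _ in range(iterations):
        mid = (lo + hi) / 2.0
        if component(mid, *g1) > component(mid, *g2):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

def window_threshold(hist, g1, g2, min_sigma=3.0):
    """Return a threshold for one window, or None if none is selected."""
    _, sigma = mean_std(hist)
    if sigma <= min_sigma or not is_bimodal(g1, g2):
        return None
    return min_error_threshold(g1, g2)
```

A window for which `window_threshold` returns None is one of the
windows whose threshold must be supplied from its neighbors.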
After thresholds have been chosen in this way for
the bimodal windows, thresholds are then defined for the
other windows by a local weighted averaging process.
Specifically, let T(u,v) be the threshold assigned to the window
centered at (u,v), or 0 if no threshold was assigned to that
window. Then for a window, say centered at (x,y), for which
no threshold has yet been assigned, we compute the threshold

    [illegible] ( T(x+1,y) + T(x-1,y) + T(x,y+1) + T(x,y-1)
      + [illegible] ( T(x+1,y+1) + T(x+1,y-1) + T(x-1,y+1) + T(x-1,y-1) ) )

provided at least one of these neighboring windows, other than
a diagonal neighbor, has had a threshold assigned. We then
smooth the resulting array of thresholds by local weighted
averaging using the array of weights

    [illegible]
    1   2   1
    [illegible]

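The assignment of thresholds to windows that received none can be
sketched as follows. This is a minimal illustration, not the original
procedure: the function names are hypothetical, and several choices are
assumptions. Horizontal and vertical neighbors are given weight 1, the
diagonal weight `diag_weight` defaults to 0.5, the sum is normalized by
the total weight of the neighbors that already have a threshold, and
the process is repeated until every window is filled. The final
smoothing pass is not shown.

```python
def fill_missing(T, diag_weight=0.5):
    """One pass over T, a 2-D list of window thresholds in which 0 means
    'unassigned'. An unassigned window with at least one assigned
    non-diagonal neighbor receives a weighted average of its assigned
    neighbors. The weights and the normalization are assumptions."""
    rows, cols = len(T), len(T[0])
    out = [row[:] for row in T]
    for y in range(rows):
        for x in range(cols):
            if T[y][x] != 0:
                continue
            total = weight = 0.0
            has_direct = False
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    if dx == 0 and dy == 0:
                        continue
                    ny, nx = y + dy, x + dx
                    if not (0 <= ny < rows and 0 <= nx < cols):
                        continue
                    if T[ny][nx] == 0:
                        continue  # neighbor has no threshold yet
                    direct = dx == 0 or dy == 0
                    w = 1.0 if direct else diag_weight
                    has_direct = has_direct or direct
                    total += w * T[ny][nx]
                    weight += w
            if has_direct:
                out[y][x] = total / weight
    return out

def fill_all(T, diag_weight=0.5):
    """Repeat fill_missing until no window is unassigned or nothing changes."""
    while any(0 in row for row in T):
        new = fill_missing(T, diag_weight)
        if new == T:
            break
        T = new
    return T
```

Reading from the previous array while writing into a copy keeps each
pass independent of the order in which the windows are visited.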
Finally, we assign thresholds to individual points
by bilinear interpolation on the window thresholds. Let P be
surrounded by the four window centers A, B, C, D, where the
thresholds for these windows are $T_A$, $T_B$, $T_C$, and $T_D$,
respectively. Let a and b be the distances of P from the lines
AB and CD, and let c and d be its distances from the lines AC
and BD. Then the threshold for P is taken to be

    $$ T_P = \frac{1}{(a+b)(c+d)} \left[ bd\,T_A + bc\,T_B + da\,T_C + ca\,T_D \right] $$

If P is not surrounded by four window centers (i.e., it lies
near the border of the image), its threshold is taken to be
that at the nearest window center.
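The interpolation formula translates directly into code. The sketch
below is illustrative only; the function name and argument order are
assumptions. The arguments a and b are the distances of P from the
lines AB and CD, and c and d its distances from the lines AC and BD.

```python
def interpolate_threshold(a, b, c, d, t_a, t_b, t_c, t_d):
    """Bilinear interpolation of four window-center thresholds.

    Each center is weighted by the product of the distances from P to
    the two opposite lines, so that a nearer center counts for more.
    """
    return (b * d * t_a + b * c * t_b + d * a * t_c + c * a * t_d) / (
        (a + b) * (c + d))
```

When P coincides with A (a = 0 and c = 0) the expression reduces to
the threshold at A, and at the center of the cell (a = b, c = d) it
reduces to the mean of the four thresholds.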

Figure 2 shows the window histograms for the four
pictures in Figure 1. [All of these were scaled so the tallest
peak has a fixed height; the vertical lines provide an
indication of the scale used on each histogram (each line
represents 50 pixels).] Figure 3 shows the two-Gaussian
approximations to the histograms for those windows that were judged
to be bimodal. The point thresholds obtained by bilinear
interpolation are displayed in Figure 4. (The horizontal lines
in Figure 2 show the thresholds at the window centers,
superimposed on the window histograms.) Figure 5 shows the
results of applying these point thresholds to the pictures in
Figure 1. For comparison, the results of using a fixed
threshold (chosen by fitting two Gaussians to the histograms
of the entire images) are shown in Figure 6. They are much
worse in the first three cases, but (surprisingly) somewhat
better in the fourth case, in some parts of the image.


3.

An extension of the Chow-Kaneko approach was
implemented in which the local histograms were tested for both
bimodality and trimodality. The latter test involves
least-squares fitting of

    $$ g(i) = \frac{P_3}{\sigma_3} e^{-(i-\mu_3)^2/2\sigma_3^2} + \frac{P_4}{\sigma_4} e^{-(i-\mu_4)^2/2\sigma_4^2} + \frac{P_5}{\sigma_5} e^{-(i-\mu_5)^2/2\sigma_5^2} $$

to the histogram F(i). The initial values of the nine
parameters (P's, $\mu$'s, and $\sigma$'s) are obtained by dividing the smoothed
histogram at its deepest and second-deepest valleys. The
hill-climbing iteration process was the same as described in
Section 2.
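The least-squares criterion behind both the two- and three-Gaussian
fits can be sketched as follows. This is a minimal illustration, not
the FMCG routine: FMCG is a conjugate-gradient minimizer, whereas the
stand-in below is a plain coordinate-wise hill climb with step
halving, and the function names, step sizes, and sweep limit are all
assumptions. The parameters are held in a flat list [P, mu, sigma,
P, mu, sigma, ...], so the same code serves for six or nine parameters.

```python
import math

def mixture(i, params):
    """Sum of Gaussians (P/sigma) * exp(-(i-mu)^2 / (2 sigma^2)), with
    params given as a flat list [P, mu, sigma, P, mu, sigma, ...]."""
    return sum(
        (params[k] / params[k + 2])
        * math.exp(-((i - params[k + 1]) ** 2) / (2.0 * params[k + 2] ** 2))
        for k in range(0, len(params), 3))

def sq_error(hist, params):
    """Sum over gray levels of [fit(i) - F(i)]^2."""
    return sum((mixture(i, params) - f) ** 2 for i, f in enumerate(hist))

def hill_climb(hist, start, step=0.5, min_step=1e-3, max_sweeps=2000):
    """Coordinate-wise hill climbing (a stand-in for FMCG): try moving
    each parameter up or down by `step`, keep any move that lowers the
    squared error, and halve the step when no move helps."""
    params = list(start)
    best = sq_error(hist, params)
    for _ in range(max_sweeps):
        if step < min_step:
            break
        improved = False
        for k in range(len(params)):
            for delta in (step, -step):
                trial = list(params)
                trial[k] += delta
                if k % 3 == 2 and trial[k] <= 0:
                    continue  # keep every sigma positive
                err = sq_error(hist, trial)
                if err < best:
                    params, best, improved = trial, err, True
        if not improved:
            step /= 2.0
    return params, best
```

The starting values passed as `start` correspond to the initial
estimates obtained from the valleys of the smoothed histogram; the
result can be no worse than that starting point, since only moves that
reduce the error are accepted.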

The bimodality test ((c) in Section 2) was applied to the
two-Gaussian fits, and the same test was also applied to each
pair of Gaussians ((3,4) and (4,5)) in the three-Gaussian
fits. If both the two- and three-Gaussian fits satisfied
these tests for a given window, the three-Gaussian fit was
used.

Let $t_{12}$ be the minimum-error threshold for the
two-Gaussian fit, and let $t_{34}$ and $t_{45}$ be the minimum-error
thresholds for the pairs of Gaussians (3,4) and (4,5). Let $\bar{t}_{34}$
and $\bar{t}_{45}$ be the averages of these latter thresholds in those
neighboring windows for which three-Gaussian fits were used.
For each window at which the two-Gaussian fit was chosen, a
decision was made as to whether the resulting threshold $t_{12}$
should correspond to the higher or lower threshold of a
three-Gaussian fit. Specifically, if
$|\bar{t}_{45} - t_{12}| > |\bar{t}_{34} - t_{12}|$ and
$|\bar{t}_{34} - t_{12}| < 4$, $t_{12}$ was regarded as a $t_{34}$ threshold; if
$|\bar{t}_{34} - t_{12}| \ge |\bar{t}_{45} - t_{12}|$ and
$|\bar{t}_{45} - t_{12}| < 4$, $t_{12}$ was regarded as
a $t_{45}$ threshold.
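The decision rule for a two-Gaussian threshold can be sketched as
follows. This is a minimal illustration; the function name, the return
labels, and the reading of the second comparison as "greater than or
equal" are assumptions. The arguments `mean_t34` and `mean_t45` stand
for the averages of the lower and higher thresholds over the
neighboring windows at which three-Gaussian fits were used.

```python
def classify_t12(t12, mean_t34, mean_t45, tol=4.0):
    """Decide whether a two-Gaussian threshold t12 should be treated as
    a lower (t34) or higher (t45) threshold of a three-Gaussian fit.

    Returns "t34", "t45", or None when t12 is within `tol` gray levels
    of neither neighborhood average.
    """
    d34 = abs(mean_t34 - t12)
    d45 = abs(mean_t45 - t12)
    if d45 > d34 and d34 < tol:
        return "t34"  # closer to the lower thresholds, and close enough
    if d34 >= d45 and d45 < tol:
        return "t45"  # closer to the higher thresholds, and close enough
    return None
```

For example, with neighborhood averages of 11 and 20, a threshold of 10
is labeled as a lower threshold and a threshold of 19 as a higher one.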

Thresholds were then interpolated for the remaining
windows (at which neither two- nor three-Gaussian fits were
obtained), and for the individual image points, exactly as in
Section 2.


Figure 7 shows the two- and three-Gaussian approximations
(whichever was chosen in each case) to the window histograms.
The $t_{34}$ and $t_{45}$ interpolated point thresholds are shown in
Figures 8 and 9, and the window histograms with the two
thresholds used at the window centers superimposed are shown
in Figure 10. Figure 11 shows the thresholded images, with
the three gray level ranges displayed as black, gray, and
white. These seem somewhat more useful than the two-Gaussian
thresholded images of Figure 5, but (in the third and fourth
parts) they are seen to be quite sensitive to shadows.


4. Comparison with iterative histogram modification

A possible alternative to variable thresholding might be
to use an adaptive requantization scheme which identifies
peaks on a histogram and replaces each peak by a spike (i.e.,
it maps all gray levels belonging to the given peak into a
single gray level). Such a scheme is described in [3, 4];
it iteratively enhances maxima on a (smoothed) histogram
until the major peaks become spikes. By increasing the amount
of smoothing used, one can reduce the number of spikes that
result, since fewer peaks will be regarded as "major". This
approach does not require the number of peaks to be
specified (e.g., 2 or 3, as in the present study); the shape of
the histogram determines the number of peaks that are
detected at a given degree of smoothing.

Figure 12 illustrates the application of this method to
two of the pictures (a and c) in Figure 1, for various
amounts (W) of smoothing. [The histograms of the original
pictures, and the resulting spike histogram for each value of
W, are shown in Figure 13.] For the first picture, the
results for two- and three-spike histograms are reasonable,
but are not as good as those obtained in Sections 2 and 3.
For the second picture, the three-spike result is somewhat
affected by shadows and suffers from loss of internal detail.

It appears that adaptive quantization is not as good as
variable thresholding for these types of pictures. This may
be because the adaptive quantization approach is best suited
for histograms that are mixtures of sharply unimodal
distributions; in an unevenly illuminated picture, on the other
hand, the gray level distribution from a uniformly reflective
region is broad and flat-topped, rather than sharply peaked.


5. Results for FLIR images


Figure 14 shows a set of four Forward Looking InfraRed
(FLIR) images containing warm objects on a cooler background.
These pictures were divided into 32x32 pixel windows (as
indicated by the grids), and were processed essentially as in
Section 2, with the following exceptions.

a) The criterion used for picking windows with large
standard deviations was $\sigma \ge 2.4$, rather than $\sigma > 3$;
even with this smaller threshold, very few windows
were chosen.

b) The initial estimates of the parameters in the
two-Gaussian approximations were chosen on the original
(unsmoothed) window histograms; these histograms
were judged to be relatively smooth, so that further
smoothing did not seem to be necessary.

c) The bimodality criteria used were

    $$ \mu_2 - \mu_1 > 3; \quad 0.1 < \sigma_1/\sigma_2 < 10.0; \quad \delta_{12} < 1.0 $$

In spite of these more tolerant criteria on $\mu$ and $\delta$,
very few windows were judged to be bimodal.

Figure 15 shows the window histograms for the four FLIR
images; Figure 16 shows the two-Gaussian approximations to
the histograms for those (few) windows that were judged
bimodal. The interpolated point thresholds for these pictures
are not shown, because they have almost no variability. (The
horizontal lines superimposed on the histograms in Figure 15
show the thresholds at the centers of the windows.) The
results of applying these thresholds to the pictures are shown
in Figure 17. For comparison, the results of using a fixed
threshold for each picture are shown in Figure 18. This
threshold was chosen using the (gray level, edge value)
clustering scheme described in [5]. The variable-thresholding
results seem to be less noisy for three out of the four
pictures, but are noisier for picture (b).

The three-Gaussian approximation method was also applied
to the FLIR images, but almost no windows were found to be
trimodal. The results of this method are therefore not shown
here.


6. Concluding remarks


Chow and Kaneko's method of variable thresholding based
on local bimodality detection, which was developed for use
in angiography, appears to be useful in other applications
as well. An extension to allow for the possibility of
trimodality yielded relatively little improvement.

Figure 1. Original pictures, with grids superimposed showing
windows over which local histograms were computed.

Figure 2. Window histograms for the pictures in Figure 1.

Figure 4. Point thresholds, obtained by interpolation on the
window thresholds, for the four pictures.

Figure 6. Results of applying fixed thresholds (obtained by
fitting two Gaussians to the histograms of the entire
pictures) to the pictures of Figure 1.

Figure 7. Two- and three-Gaussian approximations to those
window histograms in Figure 2 that were judged to be bi- and
tri-modal.

Figure 8. Darker thresholds obtained for the four pictures by
interpolating on the window thresholds.

Figure 9

Figure 11. Three-level pictures obtained by applying the
thresholds shown in Figures 8 and 9 to the pictures of
Figure 1.

Figure 15. Window histograms for the pictures in Figure 14.

Figure 16. Two-Gaussian approximations to those window
histograms in Figure 15 that were judged to be bimodal.

Figure 17. Results of applying the interpolated point
thresholds to the pictures of Figure 14.